Exhibit D

DATE SUBMITTED: [illegible]

SUGGESTED TITLE FOR THE INVENTION: Phased Array Sound System 

BRIEF DESCRIPTION OF THE INVENTION: A sound reproduction method that provides a means of 
controlling sound intensity (volume) and program content at arbitrary locations within a given listening 
space. 

DETAILED DESCRIPTION OF THE INVENTION: 

Overview: 

A sound reproduction method employing multiple loudspeakers (or ultrasonic transducers, as described
in an alternate embodiment) and digital signal processing technology to produce phased array sound
fields, where the intensity of the sound at any location in the defined listening space can be controlled by
constructive superposition of the sound waves emanating from the loudspeaker or transducer array.
Constructive superposition of the sound waves is achieved by using a digital signal processor (DSP), or
other suitably fast embedded processor system, to insert variable delays into the input signal going into
specific speakers in the array. The delays are calculated using the distance of each speaker or
transducer to a specific target location within the listening space. Using the distance to the target and the
assumed average speed of sound at room temperature and room pressure, the maximum time of flight
for the sound signal of the farthest speaker from the target in the array is calculated; then the exact
amount of additional delay (within given tolerances) is inserted into the input signal for each speaker
in the array that is closer to the target. This method achieves the result that the wavefront from each
speaker arrives at the target at the same time and roughly in phase. Due to superposition, the amplitudes
of the wavefronts will add algebraically. Since sound intensity or volume is a function of the square of the
signal amplitude, very significant sound intensities at the target can be achieved for reasonably sized
arrays. An array of 10x10 loudspeakers has the theoretical limit of achieving an increase of intensity at
the target of 10,000 times, or 40 dB, over the volume of any one speaker as heard from the same location.
The sound intensity at locations other than the target location is a function of sound addition that is not in
phase but randomly distributed in phase and timing, and is proportional to the number of speakers or
transducers in the array rather than the square of the number of speakers or transducers. As an
illustration of the resulting difference in sound intensity levels, if each speaker were operated at a
sound level of 0 dB (the threshold of audibility), then everywhere in the listening space except the
target the out-of-phase and unintelligible signal would have the sound volume of a faint whisper or of
rustling leaves, while at the target a fully intelligible, focused and in-phase signal would have a sound
intensity level nearly equal to that of normal conversation.

The target within the listening area can be thought of as the focal point of a parabolic or elliptical mirror.
As proof of this concept, a mechanical analog of this method exists in the form of a whispering gallery.
The whispering gallery is an elliptically shaped room, and as such the room has two focal points. Sound
emanating from one focal point is bounced around the room and is reconstructed in phase and amplitude
at the other focus. Thus, a seemingly private conversation initiated unwittingly at one focus, which is
essentially inaudible almost everywhere else in the whispering gallery, is reconstructed and is perfectly
audible at the other focus some distance away.

The advantage of the electronic embodiment of this concept is that there can be multiple focal points or
targets, target locations can be created or eliminated dynamically, the volume or sound intensity at each
target can be made essentially independent of the other targets, and the program content (what is heard)
at each target can be made essentially independent of the other targets. Furthermore, the targets can be
moved dynamically within the listening space and, with some additional effort, can be made to track
independent listeners within the room.
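
The gain figures quoted in the overview can be checked with a short calculation. The sketch below is illustrative only and is not part of the disclosed system; the function names are hypothetical. It assumes ideal, equal-amplitude speakers, so that in-phase addition scales intensity by the square of the speaker count and random-phase addition scales it by the speaker count.

```python
import math


def coherent_gain_db(n_speakers):
    """In-phase addition: amplitudes add, so intensity scales as N squared."""
    return 20.0 * math.log10(n_speakers)


def incoherent_gain_db(n_speakers):
    """Random-phase addition: intensities add, so intensity scales as N."""
    return 10.0 * math.log10(n_speakers)


if __name__ == "__main__":
    n = 10 * 10  # the 10x10 array used as the example in the overview
    print("at the target:", coherent_gain_db(n), "dB over one speaker")
    print("elsewhere:    ", incoherent_gain_db(n), "dB over one speaker")
```

Under these assumptions the 10x10 array gives 40 dB at the target and 20 dB elsewhere, which matches the whisper-versus-conversation comparison in the text.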
Theory of operation:

The first type of system discussed will be referred to as a base-band system. In the base-band system it
is the actual audio program that is focused into a specific target location. (Please refer to Figure 1.) This
is the block diagram of the simplest embodiment of the process, which is a base-band, single target
focus, single audio program system.

A to D:

The analog audio program source is digitized by the A to D block in Figure 1. The sampling need only
satisfy the Nyquist criterion for the audio spectrum audible to human beings. Since the top end of human
hearing is 20 kHz, the Nyquist criterion dictates a sampling rate of at least 40 kHz. There is no need to
over-sample, as the signal is only subjected to a simple delay algorithm and is low-pass filtered in the
analog domain. This embodiment could use a 44 kHz (nominally 44.1 kHz) sampling rate. This is a common
Pulse Code Modulation (PCM) [.WAV] file format sampling rate used in readily available sound card
technology.

PCM Waveform Buffer:

This buffer receives a constant stream of data. Each piece of data is a Pulse Code Modulated (PCM)
waveform sample. At the above-mentioned sampling rate, each sample is roughly 22.7 µs newer than the
previous sample. The buffer is a simple shift register, such that with the arrival of each new PCM
waveform sample, every previous sample is shifted by an offset of one further into the buffer. The net
effect is to produce a delay line of PCM waveform samples whereby the older the sample is in time, the
larger its offset into the buffer. The buffer is the means by which delay is introduced into the waveform
signal that is ultimately applied to the output speakers. The buffer need only be as large as the number
of samples that satisfies the following equation:

    Buffer size (in samples) = R_s x (d_m / V_s)

Where:

    d_m is the maximum distance that sound may have to travel for a given listening
    room size, in meters (m).

    V_s is the speed of sound in air, in meters per second (m/s).

    R_s is the sample rate of the PCM waveform, in samples per second (samples/sec).

For an average 12' by 12' room and a common sampling rate of 44 kHz, the size of the buffer works out
to be 511 samples long. 512 is a convenient binary size that would be used in the preferred embodiment.
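
The buffer sizing rule and the shift-register behavior described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `buffer_size` and `DelayLine`, the 343 m/s speed of sound, and the 3.5 m example distance are all assumptions, and the rounding up to a power of two simply mirrors the preference for a binary size stated in the text.

```python
import math
from collections import deque


def buffer_size(sample_rate_hz, max_distance_m, speed_of_sound_mps=343.0):
    """Samples needed to cover the longest time of flight, rounded up to a power of two."""
    needed = math.ceil(sample_rate_hz * max_distance_m / speed_of_sound_mps)
    return 1 << (needed - 1).bit_length()


class DelayLine:
    """Shift-register style buffer: offset 0 is the newest sample, larger offsets are older."""

    def __init__(self, size):
        # A deque with maxlen discards the oldest sample when a new one is pushed.
        self.samples = deque([0] * size, maxlen=size)

    def push(self, sample):
        self.samples.appendleft(sample)

    def read(self, offset):
        return self.samples[offset]


if __name__ == "__main__":
    line = DelayLine(buffer_size(44000, 3.5))  # 3.5 m is an assumed maximum path length
    for sample in (1, 2, 3):
        line.push(sample)
    print(line.read(0), line.read(2))  # newest sample, then the sample two periods older
```

A hardware shift register or a circular buffer with a moving write pointer would give the same delay-line behavior; the deque is used here only for brevity.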

The Data Pointer Offset Array:

There is a one to one correspondence between each element of the Data Pointer Offset Array and a
speaker in the NxN Speaker Array of Figure 1. The correspondence is such that each element of the
Data Pointer Offset Array holds a pointer to a specific address in the PCM waveform buffer for a given
speaker in the NxN Speaker Array. In other words, the k-th element of the Data Pointer Offset Array will
be uniquely associated with the speaker S_ij, the speaker in the i-th row and j-th column of the NxN
speaker array. The specific address held in the Data Pointer Offset Array represents an offset into the
PCM waveform buffer. This offset represents the specific time delay that is introduced in the audio
program signal that is sent to that specific speaker. Any given data element of the pointer offset array will
hold an offset into the PCM waveform buffer that is given by the following formula:

    Offset = R_s x T_a(i,j)

Where:

    R_s is the sample rate of the PCM waveform, in samples per second (samples/sec).

    T_a(i,j) is the additional time delay (in seconds) necessary at speaker location S_ij.
    It is given by the formula:

        T_a(i,j) = T_max - T_ij

    Where:

        T_max is the time delay due to the time of flight at the speed of
        sound over the distance between the target and the furthest
        speaker from the target.

        T_ij is the time delay due to the time of flight at the speed of
        sound over the distance between the target and the speaker S_ij.
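
As a hedged illustration of the offset formula, the sketch below computes one pointer offset per speaker from speaker and target coordinates. The function name `pointer_offsets`, the coordinate representation (tuples in meters), the 343 m/s speed of sound, and rounding to the nearest whole sample are assumptions made for the example; the disclosure specifies only the formula itself.

```python
import math


def pointer_offsets(speakers, target, sample_rate_hz=44000, speed_of_sound_mps=343.0):
    """Return one buffer offset (in samples) per speaker for a single target focus.

    speakers: list of (x, y, z) positions in meters, in speaker-address order.
    target:   (x, y, z) position of the target focus in meters.
    """
    # T_ij: time of flight from each speaker to the target.
    flight_times = [math.dist(s, target) / speed_of_sound_mps for s in speakers]
    # T_max: time of flight from the furthest speaker.
    t_max = max(flight_times)
    # Offset = R_s x (T_max - T_ij); the furthest speaker gets offset 0.
    return [round(sample_rate_hz * (t_max - t)) for t in flight_times]


if __name__ == "__main__":
    ceiling = [(0.0, 0.0, 2.4), (1.0, 0.0, 2.4), (2.0, 0.0, 2.4)]
    print(pointer_offsets(ceiling, target=(0.0, 0.0, 1.2)))
```

The speaker nearest the target receives the largest offset, so its signal is delayed the most and all wavefronts arrive together.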

The Address Generator:

The address generator is actually a function of the program that runs on the CPU. It is shown on the
block diagram for purposes of clarity. The address refers to the address that is sent to the analog
multiplexer (Analog MUX in Figure 1) and is used to route the output of the digital to analog converter
(DAC in Figure 1) to the correct speaker in the NxN speaker array.



The CPU: 

The CPU is an embedded processor of some variety that is suitably fast to perform the functions listed
below. In the preferred embodiment a Digital Signal Processor (DSP) would be used.
The primary loop of the operating program performs the following operations:

1.) Generates an address that selects a specific speaker S_ij in the NxN speaker array. The address is
generated by means of a simple incrementing process that starts from zero and is incremented by one
at the completion of each pass through the primary loop of operation.

2.) Uses the address generated in step one as an offset address into the Data Pointer Offset Array to
indirectly address and read the specific waveform sample held in the PCM waveform buffer into the
accumulator of the CPU. At this point, the data sample can be scaled up or down for volume control.

3.) Outputs the address generated in step one to the output port that is connected to the Analog MUX.

4.) Outputs the data now held in the accumulator to the output port that is connected to the DAC.

5.) Increments the address from step one by one and loops back to step one.

The entire loop must be repeated for every speaker, and all speakers must receive an output sample
from the DAC within the waveform sampling period, or roughly 22.7 µs for the preferred embodiment.
Use of a digital signal processor can be helpful in combining steps, such as the incrementing step, so as
to save computational time. Depending on the size of the speaker array, multiple CPUs may be used to
maintain computational throughput. Fortunately, since the steps involved in the primary loop are fairly
rudimentary, the processors needed can be of an inexpensive type.
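
The five steps of the primary loop can be sketched in software as below. This is an illustration under stated assumptions rather than the firmware of the disclosure: `write_mux` and `write_dac` are hypothetical callables standing in for the two output ports, the buffer is any indexable sequence with offset 0 as the newest sample, and the per-speaker time budget is simply the sampling period divided by the number of speakers.

```python
def primary_loop_pass(buffer, offsets, gain, write_mux, write_dac):
    """One sampling period: send one delayed sample to every speaker."""
    for address in range(len(offsets)):       # step 1: address counts up from zero
        sample = buffer[offsets[address]]     # step 2: indirect read via the offset array
        sample = int(sample * gain)           #         optional scaling for volume control
        write_mux(address)                    # step 3: select the speaker on the analog MUX
        write_dac(sample)                     # step 4: output the sample to the DAC
        # step 5: the for loop increments the address and repeats


def per_speaker_budget_s(sample_rate_hz, n_speakers):
    """Time available per speaker if all speakers are served within one sample period."""
    return 1.0 / (sample_rate_hz * n_speakers)


if __name__ == "__main__":
    mux_log, dac_log = [], []
    primary_loop_pass([10, 20, 30, 40], [0, 2, 3], 1.0, mux_log.append, dac_log.append)
    print(mux_log, dac_log)
    print(per_speaker_budget_s(44000, 100))  # seconds per speaker for a 10x10 array
```

For a 10x10 array at 44 kHz the budget is on the order of a few hundred nanoseconds per speaker, which is consistent with the remark that larger arrays may need multiple processors.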

The pointers held in the Data Pointer Offset array are calculated based upon the known location of the 
target in relation to the speaker array. If the target focus location is fixed, the data pointers in this array 
need only be calculated once. If the target focus location is moved, the pointers in this array need to be 
updated. If the target is to be tracked while it is in motion, the pointers in this array must be updated
continuously. 

The speaker array: 

The speaker array in the preferred embodiment consists of an NxN array of small (2 to 3 inch diameter)
high quality audio speakers that are positioned in a rectilinear grid pattern in the ceiling of the subject
listening room. The speakers need to be selected for their ability to be non-resonant, that is, to have a
frequency response that is reasonably flat in the passband of interest (in this case the passband of human
hearing), and non-directional, so as to provide a hemispheric wavefront or a reasonable facsimile thereof.

(Please refer to Figure 2.) This is the block diagram of a slightly more elaborate embodiment of the
process, which is a base-band, multiple target focus, single audio program system.

In this situation, everything is the same as for the simplest embodiment except that there are multiple
Data Pointer Offset Arrays. There is one array for each target focus. Each array holds a set of pointers
that produce the specific delays that will focus the sound at a predetermined location in the listening
area. In this case the CPU must indirectly address and read a PCM waveform sample through each of the
multiple Data Pointer Offset Arrays for each speaker location. The multiple samples are summed and
output to the DAC. Multiple targets are created by a simple process of superposition of the waveform
samples on the array of speakers.
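
The multiple-target case changes only the read step of the primary loop: one sample is read per target and the results are summed. A minimal sketch follows, assuming a hypothetical function name `speaker_sample` and ignoring numeric overflow or clipping of the summed value, which a real fixed-point implementation would need to consider.

```python
def speaker_sample(buffer, offset_arrays, speaker):
    """Sum, for one speaker, the delayed samples belonging to every target focus.

    buffer:        PCM delay line, offset 0 being the newest sample.
    offset_arrays: one Data Pointer Offset Array (list of offsets) per target focus.
    speaker:       speaker address, used as the index into each offset array.
    """
    return sum(buffer[offsets[speaker]] for offsets in offset_arrays)


if __name__ == "__main__":
    delay_line = [1, 2, 4, 8]
    targets = [[0, 1], [3, 2]]  # two target foci, two speakers
    print([speaker_sample(delay_line, targets, k) for k in range(2)])
```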

(Please refer to Figure 3.) This is the block diagram of a still more elaborate embodiment of the process,
which is a base-band, multiple target focus, multiple audio program system.

In this case everything is the same as the system pictured in Figure 2 except that there are multiple PCM
waveform buffers. The waveform samples for a given target location are indirectly accessed from the
waveform buffer that holds the desired audio program. This is a simple matter of setting a different base
address when performing the indirect address through the data pointer offset array.

(Please refer to Figure 4.) This is the block diagram of a complete embodiment of the process, which is a
base-band, multiple target focus, multiple audio program system with listener tracking.

The tracking system that is described here is by no means the only tracking system that could be used.
Other listener tracking systems that might be used are simply alternate embodiments of the same
system.

Here a tracking system is implemented that creates an interrogating target focus. The interrogating
target focus is a sub-audible tone pattern that is localized in three dimensional space and continuously
scanned through the listening room. Listeners would be required to wear a wireless microphone that is
omnidirectional and sensitive to the frequencies in the tone pattern. A listener's location within the
listening room is fixed in relation to the speaker array by sensing the time at which the signal from a
given microphone reaches its maximum. The location of the interrogating target focus is known from the
contents of the data pointer offset array for the interrogating tone. At the time when the maximum is
sensed, a copy of the data pointer offset array for the interrogating tone is placed in the array for the
listener target.

It should be noted that this method reduces the computational load of the CPU, since it eliminates the
need to calculate the delays to be programmed into the Data Pointer Offset Array: the target locations
are identified empirically as the interrogating tone is scanned through the room. The process of scanning
a tone target is a simple matter of incrementing, in a predetermined fashion, the data elements of the
Data Pointer Offset Array for the target focus to be scanned through the room.
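
The tracking step can be pictured as a peak search over one scan of the room. The sketch below is a simplification with assumed names: `scan_offset_arrays` is the sequence of offset arrays the interrogating focus stepped through, and `mic_levels` is the level reported by one listener's microphone at each step. How the levels are measured and time-aligned with the scan is not specified in the text and is left out here.

```python
def locate_listener(scan_offset_arrays, mic_levels):
    """Return a copy of the offset array that was active when the microphone level peaked."""
    peak_index = max(range(len(mic_levels)), key=mic_levels.__getitem__)
    # Copy, so that later scan updates do not disturb the listener's target array.
    return list(scan_offset_arrays[peak_index])


if __name__ == "__main__":
    scan = [[0, 1], [2, 3], [4, 5]]   # offset arrays for three scan positions
    levels = [0.1, 0.9, 0.3]          # microphone level observed at each position
    print(locate_listener(scan, levels))
```

No delay calculation is performed: the listener's offset array is simply whichever scan entry produced the strongest microphone signal, as the text describes.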
The disadvantage of this method is that the listeners are still required to wear some kind of appliance
(i.e. a wireless microphone). It would be best if the item were worn high on the body; the shoulder or
collar would be optimal. In light of current technology and the power requirements for wireless
transmission, it is conceivable that the item could be made to look like a small pin or piece of jewelry.

Alternate embodiment:

The base-band system has some minor drawbacks. One of these is that the reconstruction of lower
frequencies (longer wavelengths) occurs over a larger volume of physical space than that of higher
frequencies, so that the area of increased sound pressure may overlap into another target area if the
target spacing is not held to specific limits. Another is that the dynamic range of the audio program may
be limited if the goal of relative inaudibility is sought for non-target locations. Even though the signal
is likely to be unintelligible, the sound intensity at frequencies in the audible range is proportional to
the number of speakers in the array and the aggregate power output level of the speaker array. These
issues can be minimized to a great degree through implementation of the alternate embodiment of the
phased array sound system.

In this embodiment, the audio program is directed to the target location via a phase modulated ultrasonic
carrier wave. The modulated carrier is of the form "A cos(ω_c t + θ)", where A is the amplitude, ω_c is the
frequency [illegible], and θ is varied between -π/2 and [illegible] (in radians). The signal that modulates
the carrier wave is the audio program signal that is desired to be heard at the target location. The audio
program signal is then demodulated at the target location in free air through the process of heterodyning
with an unmodulated carrier at the same frequency and zero phase displacement. This is achieved by
placing, at the target location, a pure sinusoidal wave of the form A cos(ω_c t). It is important that the
frequency of both sinusoidal signals is exactly the same and that the unmodulated waveform has 0 phase
offset in relation to the modulated waveform when θ in the modulated waveform is 0. This heterodyning
results in a translation from phase modulation to amplitude modulation. The sizes of the respective
signals are controlled to ensure that the amplitude modulation is maintained at or below 100%.

The original base-band audio program is recovered through the use of square law effect amplification to
introduce a non-linearity in the amplitude modulated signal. The non-linearity is achieved through the
process of transducing the 40 kHz electrical signal to ultrasound. During this process, square law
amplifiers are used to introduce a small non-linearity such that the sinusoids are not perfectly
symmetrical. The resulting asymmetry will produce a waveform for the modulated (as well as the
unmodulated) signal that has positive (negative) excursions that are slightly larger than the negative
(positive) ones (see Figure 5).
Once these conditions are met at the target location, the result is an average signal that follows the
envelope of the modulated signal. Since the amplitude modulated signal is on a carrier wave that is well
above the audible range, the human ear will provide what amounts to a low-pass filtering of the average
envelope of the amplitude modulated signal. The sound perceived at the target location is that of the
original base-band audio program.
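
The translation from phase modulation to amplitude modulation follows from the sum-to-product identity cos(a) + cos(b) = 2 cos((a - b)/2) cos((a + b)/2). The sketch below checks this numerically for the two waves named in the text. It assumes both waves have the same amplitude A at the target (the text uses the same symbol for both), and the function names are hypothetical. It does not model the square law non-linearity or the ear's low-pass behavior.

```python
import math


def summed_field(amplitude, omega_c, t, theta):
    """Modulated carrier plus unmodulated reference, as superposed at the target."""
    return amplitude * math.cos(omega_c * t + theta) + amplitude * math.cos(omega_c * t)


def closed_form(amplitude, omega_c, t, theta):
    """Same field written as a carrier with a theta-dependent amplitude."""
    return 2.0 * amplitude * math.cos(theta / 2.0) * math.cos(omega_c * t + theta / 2.0)


def envelope(amplitude, theta):
    """Amplitude of the summed carrier for a given modulation phase theta."""
    return 2.0 * amplitude * abs(math.cos(theta / 2.0))


if __name__ == "__main__":
    omega_c = 2.0 * math.pi * 40e3  # 40 kHz carrier, as mentioned in the text
    for theta in (0.0, -math.pi / 4, -math.pi / 2):
        print(round(theta, 3), round(envelope(1.0, theta), 4))
```

With equal amplitudes, the envelope is largest when θ is 0 and falls as the phase offset grows, so variations in θ appear as variations in the amplitude of the summed carrier.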

Greater dynamic range in the reconstruction of the signal is possible due to the fact that sound pressure
in the non-target locations has the vast majority of its signal energy at frequencies well above the range
of human hearing. The base-band signal is reconstructed in the very localized volume of the target focus
location. If the carrier wave is 40 kHz, the target focus is a spherical volume of roughly 0.33 inches in
diameter. Thus, target locations can be positioned much more closely with no overlap. The air in this
volume is compressed and rarefied at all the signal frequencies contained in the amplitude modulated
signal. Air outside this focus location receives energy from the focus in much the same way as a speaker
imparts sound energy to the air in its immediate vicinity. The ultrasonic transducers that produce the
sound can be run at very high signal output intensities (e.g., 160 dB). Even though it is unlikely that such
high sound output powers would actually be used, it is an indication that significant dynamic range in
output power could be achieved.
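
As a rough plausibility check on the quoted focus size, the wavelength of sound at the carrier frequency can be computed directly. The text does not state how the 0.33 inch figure was derived; the sketch below only shows that it is close to one wavelength at 40 kHz, assuming a speed of sound of 343 m/s. The function name is hypothetical.

```python
def wavelength_inches(frequency_hz, speed_of_sound_mps=343.0):
    """Acoustic wavelength in inches at the given frequency."""
    meters = speed_of_sound_mps / frequency_hz
    return meters / 0.0254  # 1 inch = 0.0254 m


if __name__ == "__main__":
    print(round(wavelength_inches(40e3), 3))  # ultrasonic carrier
    print(round(wavelength_inches(1e3), 1))   # a mid-band audio tone, for comparison
```

The comparison with an audio-band tone illustrates why the base-band system needs wider target spacing than the ultrasonic embodiment.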

A coarse implementation of this scheme is pictured in Figures 6 and 7. Figure 7 shows a system very
similar to that pictured in Figure 4, with the exception that the speaker array has been replaced with an
array of ultrasonic transducers. Communication to each element of the array is now done via a digital
serial link, and there is a high accuracy radio frequency synchronizing pulse that is used to ensure
accurate phase alignment for all of the transducer elements of the array. Figure 6 shows a typical array
element, which has three major components: an application specific integrated circuit (ASIC), a square
law amplifier and an ultrasonic transducer. The application specific IC produces the phase modulated
and/or unmodulated electrical signal that will have the proper phase relationship to the other transducers
in the array so as to produce the desired aiming of the signal into the listening space. It is comprised of:

1.) A controller that receives digital data, addressed to this specific transducer location, from a high
speed serial port.

2.) Single element data buffers that hold pointer offsets into the carrier wave buffer. These offsets are
updated by the data received from the serial port.

3.) A time base that is phase locked to the synchronizing signal.

4.) A waveform buffer that contains pulse code modulated waveform samples that represent one full
cycle of the carrier wave sinusoid.

5.) A summing function that produces a net pointer into the carrier wave sinusoid buffer.

6.) A summing function that sums the amplitude of the carrier wave sample for each target.

7.) A digital-to-analog converter.

The square law amplifier provides the bias voltage and driving voltage for the ultrasonic transducer. It
also introduces a small non-linearity into the signal such that the carrier wave is slightly asymmetrical.
It should be noted that the non-linearity may be encoded into the PCM waveform samples of the carrier
wave sinusoid, thus eliminating the need for a special non-linear amplifier, but the square law amplifier
is shown here for clarity. An amplifier is needed in any case to provide the special bias and driving
voltages for the transducer.

The ultrasonic transducer is a commercially available (Polaroid) electrostatic transducer, in this case a
7000 series. It is operated near its peak efficiency of 55 kHz. The aperture plate is employed to act as a
beam spreader. Without the aperture plate the beam pattern from the transducer is highly directional. A
50% decrease of the aperture produces a doubling of the beam angle. The net effect of the aperture (once
properly sized) is to act like a pinhole lens and produce a hemispherical wavefront to as great a degree
as possible.

The operation of the transducer array element is as follows. In the base-band system the audio program
signal had actual delays introduced for specific array speakers to produce convergence of the signal at a
specific point in the listening room. In order for the carrier wave to converge at a specific target location,
a similar concept is employed. However, for a continuous sinusoid this delay reduces to a whole number
of cycles and a residual phase, so it is only necessary to modify the phase of the output sinusoid for
each element of the array in order to produce constructive superposition at the desired target location for
the unmodulated sinusoid. The time base is a phase-locked loop, locked to the time base synchronizing
signal. It is used to increment a pointer through the carrier wave sinusoid PCM buffer. All elements of the
array increment to the next PCM sample in lock with the synchronizing signal. The speed of
incrementing is such that the entire buffer is strobed through at a rate of 55 kHz. The residual phase
necessary for this particular element is read from the target phase offset buffer. The amount of phase
angle offset is in the form of a signed integer number of samples. Thus, the number read from the
"Target/channel modulation phase offset" single data element buffer is summed with the pointer
produced by the time base and is used to retrieve the PCM sample from the carrier wave sinusoid
buffer. Audio program phase modulation of the carrier wave is incorporated into the target phase offset,
with the phase modulation offset presented to the summing function delayed in time by an amount
dictated by the central CPU of Figure 7. For each element, multiple targets and multiple channels are
achieved by a simple process of summing the samples that are mapped to this particular transducer
element of the array.
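
The reduction of a delay to a residual phase, and the pointer arithmetic inside one array element, can be sketched as follows. This is an illustration under several assumptions that are not in the text: a 64-sample carrier table, a cosine stored in that table, rounding to the nearest sample, and the hypothetical function names shown. Offsets may be signed; the modulo operation wraps them into the one-cycle buffer.

```python
import math


def carrier_table(n_samples):
    """One full cycle of the carrier sinusoid as PCM-style samples."""
    return [math.cos(2.0 * math.pi * k / n_samples) for k in range(n_samples)]


def residual_phase_samples(delay_s, carrier_hz, n_samples):
    """Convert a time delay to a phase offset, in samples, within one carrier cycle."""
    cycles = delay_s * carrier_hz
    residual = cycles - math.floor(cycles)   # whole cycles drop out for a continuous sinusoid
    return round(residual * n_samples) % n_samples


def element_output(table, time_base_pointer, target_offsets):
    """Sum the carrier samples selected for each target/channel on this element."""
    n = len(table)
    return sum(table[(time_base_pointer + offset) % n] for offset in target_offsets)


if __name__ == "__main__":
    table = carrier_table(64)
    offset = residual_phase_samples(1.25 / 55e3, 55e3, 64)  # a delay of 1.25 carrier cycles
    print(offset, element_output(table, 0, [offset, -offset]))
```

A delay of 1.25 cycles and a delay of 0.25 cycles give the same offset, which is the simplification the text relies on for the unmodulated carrier.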

The phase output of each element of the array, when considered across the entire array, can be thought
of as a time varying two dimensional Fourier Transform representing the spatial frequencies of the
target locations. The array is, in essence, a phase-encoded hologram where the reference/playback
beam is the ultrasonic carrier wave.

Figure 7 provides the target location phase offset calculation and audio program array element delays
for each element of the array. The computational load for the CPU is only slightly larger in this system
than in the base-band system, since the delay offset pointer for each array element must be
converted into a signed integer number of carrier wave samples before being transmitted to the
particular array element that is mapped to that pointer. Since the targets in question don't move with
great rapidity in relation to the computational reaction time of the CPU, this relationship of carrier wave
offset samples to data pointer offsets (as pictured in Figure 7) is treated as a lookup table. In this case,
the data pointer offset array points to a data element that is the sum of the PCM waveform buffer sample
(that has been converted to a carrier wave phase offset number) and the target phase offset (that has
also been converted to a carrier wave offset number). An overall communications efficiency is realized
by allowing the summing to occur at the central CPU, so that only one number per channel per ASIC
needs to be transmitted to each transducer in the array. In other words, if 3 active targets/channels are
being transmitted to in the listening room, then the central CPU must update each ASIC at each
transducer location in the array with 3 offset numbers.

Listener tracking is done in an identical fashion to the base-band system.



KNOWN PRIOR PUBLICATION: 

None relating to the production of a "real image" of sound.

While there is very active work in the area of "3D" sound, these efforts are all aimed at exploiting the
psycho-acoustic effects of time and phase relationships produced by stereo and Dolby type systems.
These systems take advantage of the biophysical phenomena of Interaural Time Differences (ITD) and
Interaural Intensity Differences (IID), as well as the Head Related Transfer Function (HRTF), to
create "virtual images" of sound.

ADVANTAGES OVER OTHER PRIOR ART: 

Current and prior art related to public address as well as other sound reproduction systems has
concentrated on creating a homogeneous sound field. Even when specific elements of an audio program
were to have an identifiable location in the sound field, the program was intended to be heard by all in the
same listening room. Aside from the use of headphones, private listening was always a matter of keeping
the volume down. Prior art has never made available distinct and controllable volume levels for individuals
in the same listening room. Aside from the use of headphones, prior art has never been able to provide
different audio programs to the inhabitants of the same listening room. As examples of the usefulness of
this process, consider multilingual school rooms or auditoriums where listeners, if properly equipped
(using a system with listener tracking) or seated in the proper location (in a non-tracking type system),
might hear the presenter in his or her own language without the use of cumbersome headphones. In the
entertainment realm, it might be possible to have both edited and non-edited versions of motion picture
film dialog presented to the same audience at the same time. Entirely different plot lines might be
presented to different portions of the audience. In public places, one jukebox might be able to play
several different songs at the same time. Hands-free phone operation might be achieved in open office
environments while still maintaining private conversation. Buildings so equipped could take advantage of
listener tracking to automatically route telephone and intercom signals to the desired recipient without
the need of a handset. The conversation could be full duplex if the tracking system were of the wireless
microphone type.